AITopics | single sample path

The interval is computed from a single finite-length sample path from the Markov chain, and does not require the knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either only provide point estimates, or require a reset mechanism, or additional prior knowledge.

artificial intelligence, machine learning, markov chain, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > Alberta (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path

Neural Information Processing SystemsFeb-7-2025, 10:38:54 GMT

This article provides the first procedure for computing a fully data-dependent interval that traps the mixing time t_{mix} of a finite reversible ergodic Markov chain at a prescribed confidence level. The interval is computed from a single finite-length sample path from the Markov chain, and does not require the knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either only provide point estimates, or require a reset mechanism, or additional prior knowledge. The interval is constructed around the relaxation time t_{relax}, which is strongly related to the mixing time, and the width of the interval converges to zero roughly at a \sqrt{n} rate, where n is the length of the sample path. Upper and lower bounds are given on the number of samples required to achieve constant-factor multiplicative accuracy.

mixing time estimation, reversible markov chain, single sample path, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

Add feedback

Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path

Neural Information Processing SystemsMar-13-2024, 00:44:11 GMT

The interval is computed from a single finite-length sample path from the Markov chain, and does not require the knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either only provide point estimates, or require a reset mechanism, or additional prior knowledge.

artificial intelligence, machine learning, markov chain, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > Alberta (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Oracle-free Reinforcement Learning in Mean-Field Games along a Single Sample Path

Zaman, Muhammad Aneeq uz, Koppel, Alec, Bhatt, Sujay, Başar, Tamer

arXiv.org Artificial IntelligenceApr-11-2023

We consider online reinforcement learning in Mean-Field Games (MFGs). Unlike traditional approaches, we alleviate the need for a mean-field oracle by developing an algorithm that approximates the Mean-Field Equilibrium (MFE) using the single sample path of the generic agent. We call this {\it Sandbox Learning}, as it can be used as a warm-start for any agent learning in a multi-agent non-cooperative setting. We adopt a two time-scale approach in which an online fixed-point recursion for the mean-field operates on a slower time-scale, in tandem with a control policy update on a faster time-scale for the generic agent. Given that the underlying Markov Decision Process (MDP) of the agent is communicating, we provide finite sample convergence guarantees in terms of convergence of the mean-field and control policy to the mean-field equilibrium. The sample complexity of the Sandbox learning algorithm is $\tilde{\mathcal{O}}(\epsilon^{-4})$ where $\epsilon$ is the MFE approximation error. This is similar to works which assume access to oracle. Finally, we empirically demonstrate the effectiveness of the sandbox learning algorithm in diverse scenarios, including those where the MDP does not necessarily have a single communicating class.

artificial intelligence, machine learning, oracle-free reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2208.11639

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)

Genre:

Research Report (0.40)
Instructional Material (0.34)

Industry:

Energy (0.67)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path

Hsu, Daniel J., Kontorovich, Aryeh, Szepesvari, Csaba

Neural Information Processing SystemsFeb-14-2020, 09:12:26 GMT

This article provides the first procedure for computing a fully data-dependent interval that traps the mixing time $t_{mix}$ of a finite reversible ergodic Markov chain at a prescribed confidence level. The interval is computed from a single finite-length sample path from the Markov chain, and does not require the knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either only provide point estimates, or require a reset mechanism, or additional prior knowledge. The interval is constructed around the relaxation time $t_{relax}$, which is strongly related to the mixing time, and the width of the interval converges to zero roughly at a $\sqrt{n}$ rate, where $n$ is the length of the sample path. Upper and lower bounds are given on the number of samples required to achieve constant-factor multiplicative accuracy.

mixing time estimation, reversible markov chain, single sample path, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

Add feedback

Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path

Hsu, Daniel J., Kontorovich, Aryeh, Szepesvari, Csaba

Neural Information Processing SystemsDec-31-2015

This article provides the first procedure for computing a fully data-dependent interval that traps the mixing time $t_{mix}$ of a finite reversible ergodic Markov chain at a prescribed confidence level. The interval is computed from a single finite-length sample path from the Markov chain, and does not require the knowledge of any parameters of the chain. This stands in contrast to previous approaches, which either only provide point estimates, or require a reset mechanism, or additional prior knowledge. The interval is constructed around the relaxation time $t_{relax}$, which is strongly related to the mixing time, and the width of the interval converges to zero roughly at a $\sqrt{n}$ rate, where $n$ is the length of the sample path. Upper and lower bounds are given on the number of samples required to achieve constant-factor multiplicative accuracy. The lower bounds indicate that, unless further restrictions are placed on the chain, no procedure can achieve this accuracy level before seeing each state at least $\Omega(t_{relax})$ times on the average. Finally, future directions of research are identified.

artificial intelligence, machine learning, markov chain, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.97)

Add feedback